LLM extension: add ethosu 8w16a and quantize scope plumbing#19876
xingguo01 wants to merge 1 commit into
- adds the `ethosu_8w16a` PT2E quantization mode
- introduces shared `quantization.quantize_scope` handling for Arm backends
- wires the Arm quantize scope through the LLM export path
- passes Ethos-U system config and memory mode through the partitioner setup

Signed-off-by: Xingguo Li <xingguo.li@arm.com>
Change-Id: I3f8446a20fb63670520a6b35484669d8df6f31bf
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19876

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job, 4 Unrelated Failures
As of commit e079d39 with merge base 88faab2:
- NEW FAILURE - The following job has failed:
- CANCELLED JOB - The following job was cancelled. Please retry:
- FLAKY - The following job failed, but this was likely due to flakiness present on trunk:
- BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR extends the LLM export PT2E quantization plumbing for Arm backends by adding a new Ethos-U quantization mode and introducing shared “quantize scope” handling across TOSA/Ethos-U/VGF, along with passing Ethos-U compiler flags through the export/partitioner setup.
Changes:
- Added PT2E quantization mode `ethosu_16a8w` and wired it through the LLM export path.
- Introduced shared Arm quantization-scope application logic (`full` vs `linear`) for TOSA/Ethos-U/VGF quantizers.
- Plumbed Ethos-U `extra_flags` through the partitioner and quantizer setup.
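
A minimal sketch of what the `full` vs `linear` scope handling described above could look like. It does not use the real Arm quantizer classes: `StubQuantizer`, its fields, and `apply_quantize_scope` are hypothetical stand-ins introduced only for illustration, and the assumption is that `full` applies one config to the whole graph while `linear` restricts it to linear layers.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class QuantizeScope(str, Enum):
    """Scope values named in the review summary."""
    FULL = "full"
    LINEAR = "linear"


@dataclass
class StubQuantizer:
    """Hypothetical stand-in for a TOSA/Ethos-U/VGF quantizer (not a real API)."""
    global_config: Optional[str] = None
    module_type_configs: Dict[str, str] = field(default_factory=dict)


def apply_quantize_scope(quantizer: StubQuantizer, config: str, scope: str) -> StubQuantizer:
    """Apply `config` to the whole model or only to linear layers (assumed semantics)."""
    resolved = QuantizeScope(scope)  # raises ValueError for unknown scope strings
    if resolved is QuantizeScope.FULL:
        quantizer.global_config = config
    else:
        quantizer.module_type_configs["Linear"] = config
    return quantizer


full_q = apply_quantize_scope(StubQuantizer(), "16a8w", "full")
linear_q = apply_quantize_scope(StubQuantizer(), "16a8w", "linear")
```

Keeping the scope decision in one helper means each backend quantizer only has to expose a global setter and a per-module-type setter, which is presumably why the PR shares the logic across backends.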
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `extension/llm/export/quantizer_lib.py` | Adds `ethosu_16a8w`, adds Arm quantize-scope helper, and plumbs scope/flags into Arm quantizers. |
| `extension/llm/export/partitioner_lib.py` | Passes Ethos-U `extra_flags` through to `EthosUCompileSpec`. |
| `extension/llm/export/config/llm_config.py` | Adds `ethosu_16a8w`, introduces shared `QuantizeScope`, and adds Ethos-U `extra_flags` config. |
| `examples/models/llama/export_llama_lib.py` | Exposes `ethosu_16a8w` via CLI and wires Arm quantize-scope + Ethos-U flags into quantizer/partitioner creation. |

    compile_spec = EthosUCompileSpec(
        target,
        system_config,
        memory_mode,
        extra_flags=extra_flags,
    )
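
To illustrate the plumbing in the call above, here is a hedged sketch of collecting the Ethos-U options in one place and splitting them into the positional and keyword arguments the diff passes. `EthosUOptions` and `compile_spec_args` are hypothetical names, `extra_flags` is assumed to be a list of strings, and the values in the usage lines are example placeholders rather than settings taken from this PR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class EthosUOptions:
    """Hypothetical config holder; field names mirror the arguments in the diff."""
    target: str
    system_config: Optional[str] = None
    memory_mode: Optional[str] = None
    extra_flags: Optional[List[str]] = None  # assumed to be a list of strings


def compile_spec_args(
    opts: EthosUOptions,
) -> Tuple[Tuple[str, Optional[str], Optional[str]], Dict[str, Optional[List[str]]]]:
    """Return (positional, keyword) arguments in the order used by the diff."""
    positional = (opts.target, opts.system_config, opts.memory_mode)
    # Copy the list so later edits to the config do not leak into the spec.
    keyword = {"extra_flags": list(opts.extra_flags) if opts.extra_flags else None}
    return positional, keyword


# Example placeholder values, not taken from the PR.
opts = EthosUOptions(
    target="ethos-u55-128",
    system_config="Ethos_U55_High_End_Embedded",
    memory_mode="Shared_Sram",
    extra_flags=["--example-flag"],
)
args, kwargs = compile_spec_args(opts)
# With the real class this would be: EthosUCompileSpec(*args, **kwargs)
```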

    "vulkan_8w",
    "tosa_8a8w",
    "ethosu_8a8w",
    "ethosu_16a8w",
    "vgf_8a8w",
    "vgf_16a8w",
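
The mode names listed above appear to follow a `<backend>_<N>a<M>w` pattern, with the activation part optional (as in `vulkan_8w`). Assuming that convention holds, with `a` giving activation bits and `w` giving weight bits, a small parser could look like the sketch below. `QuantMode` and `parse_mode` are hypothetical helpers, not part of ExecuTorch.

```python
import re
from typing import NamedTuple, Optional


class QuantMode(NamedTuple):
    backend: str
    activation_bits: Optional[int]  # None when the name has no "<N>a" part
    weight_bits: int


# Assumed naming convention: "<backend>_<N>a<M>w" or "<backend>_<M>w".
_MODE_RE = re.compile(r"^(?P<backend>[a-z0-9]+)_(?:(?P<act>\d+)a)?(?P<wt>\d+)w$")


def parse_mode(name: str) -> QuantMode:
    """Split a PT2E mode name into backend, activation bits, and weight bits."""
    match = _MODE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised quantization mode: {name!r}")
    act = match.group("act")
    return QuantMode(
        backend=match.group("backend"),
        activation_bits=int(act) if act is not None else None,
        weight_bits=int(match.group("wt")),
    )


new_mode = parse_mode("ethosu_16a8w")
weight_only = parse_mode("vulkan_8w")
```

Under this convention the new mode reads as 16-bit activations with 8-bit weights on the Ethos-U backend.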
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani